audio sample
Physics-Guided Deepfake Detection for Voice Authentication Systems
Mohammadi, Alireza, Sood, Keshav, Thiruvady, Dhananjay, Nazari, Asef
Abstract: Voice authentication systems deployed at the network edge face dual threats: a) sophisticated deepfake synthesis attacks and b) control-plane poisoning in distributed federated learning protocols. We present a framework coupling physics-guided deepfake detection with uncertainty-aware edge learning. The representations are then processed via a Multi-Modal Ensemble Architecture, followed by a Bayesian ensemble providing uncertainty estimates. Incorporating physics-based characteristic evaluations and uncertainty estimates of audio samples allows our proposed framework to remain robust to both advanced deepfake attacks and sophisticated control-plane poisoning, addressing the complete threat model for networked voice authentication. Advanced neural speech deepfake generation has fundamentally transformed voice authentication security.
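As a rough illustration of how an ensemble can turn several detector outputs into an uncertainty estimate, the sketch below averages per-member "fake" probabilities and splits the predictive entropy into an average per-member term and a disagreement term. This is a generic ensemble decomposition, not the authors' Bayesian ensemble; the function names and the dictionary keys are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def ensemble_uncertainty(member_probs):
    """Combine per-member P(fake) scores (hypothetical inputs) into a mean
    score, the total predictive entropy, and the member disagreement."""
    n = len(member_probs)
    mean_p = sum(member_probs) / n
    # Entropy of the averaged prediction: total uncertainty.
    total = binary_entropy(mean_p)
    # Average entropy of the individual members.
    expected = sum(binary_entropy(p) for p in member_probs) / n
    # The gap between the two is large when members disagree.
    return {"p_fake": mean_p, "total": total, "disagreement": total - expected}
```

A sample on which the members split evenly between confident "real" and confident "fake" gets a high disagreement value, which a downstream system could use to defer the decision rather than accept or reject the speaker.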
- Oceania > Australia (0.04)
- North America > United States (0.04)
HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal
Li, Kexin, Hu, Xiao, Grishchenko, Ilya, Lie, David
The availability of high-quality, AI-generated audio raises security challenges such as misinformation campaigns and voice-cloning fraud. A key defense against the misuse of AI-generated audio is watermarking it, so that it can be easily distinguished from genuine audio. Because those seeking to misuse AI-generated audio may try to remove audio watermarks, studying effective watermark removal techniques is critical to objectively evaluating the robustness of audio watermarks against removal. Previous watermark removal schemes either assume impractical knowledge of the watermarks they are designed to remove or are computationally expensive, potentially generating a false sense of confidence in current watermark schemes. We introduce HarmonicAttack, an efficient audio watermark removal method that only requires the basic ability to generate the watermarks from the targeted scheme and nothing else. With this, we are able to train a general watermark removal model that can remove the watermarks generated by the targeted scheme from any watermarked audio sample. HarmonicAttack employs a dual-path convolutional autoencoder that operates in both temporal and frequency domains, along with GAN-style training, to separate the watermark from the original audio. When evaluated against state-of-the-art watermark schemes AudioSeal, WavMark, and Silentcipher, HarmonicAttack demonstrates greater watermark removal ability than previous watermark removal methods with near real-time performance. Moreover, while HarmonicAttack requires training, we find that it is able to transfer to out-of-distribution samples with minimal degradation in performance.
- Media (1.00)
- Information Technology > Security & Privacy (1.00)
New AI technique sounding out audio deepfakes
Researchers from Australia's national science agency CSIRO, Federation University Australia and RMIT University have developed a method to improve the detection of audio deepfakes. The new technique, Rehearsal with Auxiliary-Informed Sampling (RAIS), is designed for audio deepfake detection, a growing concern given cybercrime risks such as bypassing voice-based biometric authentication systems, impersonation and disinformation. It determines whether an audio clip is real or artificially generated (a 'deepfake') and maintains performance over time as attack types evolve. In Italy earlier this year, an AI-cloned voice of its Defence Minister requested a €1M 'ransom' from prominent business leaders, convincing some to pay. This is just one of many examples highlighting the need for audio deepfake detectors.
- North America > United States (0.14)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia (0.04)
Fine-tuning Pre-trained Audio Models for COVID-19 Detection: A Technical Report
de Brito, Daniel Oliveira, de Souza, Letícia Gabriella, Gauy, Marcelo Matheus, Finger, Marcelo, Junior, Arnaldo Candido
This technical report investigates the performance of pre-trained audio models on COVID-19 detection tasks using established benchmark datasets. We fine-tuned Audio-MAE and three PANN architectures (CNN6, CNN10, CNN14) on the Coswara and COUGHVID datasets, evaluating both intra-dataset and cross-dataset generalization. We implemented a strict demographic stratification by age and gender to prevent models from exploiting spurious correlations between demographic characteristics and COVID-19 status. Intra-dataset results showed moderate performance, with Audio-MAE achieving the strongest result on Coswara (0.82 AUC, 0.76 F1-score), while all models demonstrated limited performance on COUGHVID (AUC 0.58-0.63). Cross-dataset evaluation revealed severe generalization failure across all models (AUC 0.43-0.68), with Audio-MAE showing strong performance degradation (F1-score 0.00-0.08). Our experiments demonstrate that demographic balancing, while reducing apparent model performance, provides a more realistic assessment of COVID-19 detection capabilities by eliminating demographic leakage, a confounding factor that inflates performance metrics. Additionally, the limited dataset sizes after balancing (1,219-2,160 samples) proved insufficient for deep learning models that typically require substantially larger training sets. These findings highlight fundamental challenges in developing generalizable audio-based COVID-19 detection systems and underscore the importance of rigorous demographic controls for clinically robust model evaluation.
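One plausible way to read the demographic balancing described above is to downsample so that, within every age-and-gender stratum, positive and negative cases are equally frequent. The sketch below does that on plain dictionaries. It is an assumption about the general idea, not the report's actual procedure, and the field names `age_group`, `gender` and `label` are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_stratum(records, seed=0):
    """Downsample so each (age_group, gender) stratum holds equally many
    label-0 and label-1 records. Field names are illustrative assumptions."""
    rng = random.Random(seed)
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in records:
        strata[(r["age_group"], r["gender"])][r["label"]].append(r)

    balanced = []
    for key in sorted(strata):
        groups = strata[key]
        # Keep as many samples per class as the rarer class provides;
        # a stratum missing one class contributes nothing.
        k = min(len(groups[0]), len(groups[1]))
        for label in (0, 1):
            balanced.extend(rng.sample(groups[label], k))
    return balanced
```

Because every stratum ends up with a 50/50 label split, age and gender alone carry no information about the label in the balanced set. The cost is a smaller dataset, which matches the reduced sample counts the report mentions.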
- South America > Brazil > São Paulo (0.05)
- North America > Canada (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- North America > United States > New Hampshire (0.04)
- Europe > France (0.04)
- Asia > Taiwan (0.04)
- Media (1.00)
- Information Technology > Security & Privacy (1.00)